Bayesian audio source separation
Abstract
This chapter describes a Bayesian approach to audio source separation. The approach models sound sources probabilistically as (sparse) linear combinations of atoms from a dictionary and performs inference by Markov chain Monte Carlo (MCMC). Several prior distributions are considered for the source expansion coefficients. We first consider general independent and identically distributed (iid) priors, with two choices of distribution. The first is the Student t, which is a good model for sparsity when its shape parameter is small. The second is a hierarchical mixture distribution: conditionally upon an indicator variable, a coefficient is either set to zero or given a normal distribution whose variance is in turn given an inverted-Gamma distribution. We then consider more audio-specific models in which both the identical-distribution and independence assumptions are lifted. Using a Modified Discrete Cosine Transform (MDCT) dictionary, a time-frequency orthonormal basis, we describe frequency-dependent structured priors that explicitly model the harmonic structure of sound through Markov hierarchical modeling of the expansion coefficients. Separation results are given for a stereophonic recording of three sources.
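The two iid priors named in the abstract can be illustrated by drawing samples from them. The following is only a minimal sketch under stated assumptions, not the chapter's implementation: the function names, the activity probability `p_active`, and the hyperparameters `shape`, `scale` and `dof` are hypothetical, and the Student t is drawn through its standard scale-mixture-of-normals form (normal with inverted-Gamma variance), which is assumed here to match the abstract's "shape parameter".

```python
import random


def sample_spike_and_slab(n, p_active=0.1, shape=2.0, scale=1.0, seed=0):
    """Draw n coefficients from a hierarchical mixture prior (illustrative).

    Each coefficient is exactly zero unless its indicator is active; an
    active coefficient is normal with an inverted-Gamma distributed variance.
    All parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    coeffs = []
    for _ in range(n):
        active = rng.random() < p_active  # Bernoulli indicator variable
        if not active:
            coeffs.append(0.0)
            continue
        # variance ~ inverted-Gamma(shape, scale), obtained as the
        # reciprocal of a Gamma(shape, scale=1/scale) draw
        variance = 1.0 / rng.gammavariate(shape, 1.0 / scale)
        coeffs.append(rng.gauss(0.0, variance ** 0.5))
    return coeffs


def sample_student_t(n, dof=1.0, seed=0):
    """Draw n Student t samples as a scale mixture of normals.

    x | v ~ N(0, v) with v ~ inverted-Gamma(dof/2, dof/2) gives a Student t
    with `dof` degrees of freedom; small `dof` yields heavy tails, so most
    draws are small and a few are large.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        variance = 1.0 / rng.gammavariate(dof / 2.0, 2.0 / dof)
        samples.append(rng.gauss(0.0, variance ** 0.5))
    return samples


if __name__ == "__main__":
    mixture = sample_spike_and_slab(1000)
    print("fraction of exact zeros:", mixture.count(0.0) / len(mixture))
    print("first Student t draws:", sample_student_t(5))
```

The mixture prior produces exact zeros, whereas the Student t only concentrates mass near zero, which is one way to see the difference between the two sparsity models.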
Related resources
Bayesian group sparse learning for music source separation
Nonnegative matrix factorization (NMF) is developed for parts-based representation of nonnegative signals with the sparseness constraint. The signals are adequately represented by a set of basis vectors and the corresponding weight parameters. NMF has been successfully applied to blind source separation and many other signal processing systems. Typically, controlling the degree of sparseness a...
A Bayesian Approach to Time-frequency Based Blind Source Separation
In this paper we propose a Bayesian approach for time-frequency (t-f) based source separation. We propose a Gibbs sampler, a standard Markov chain Monte Carlo (MCMC) simulation method, to sample from the mixing matrix, the source t-f coefficients and the input noise variance, under two models for the sources. In the first one the t-f coefficients of the sources are assumed i.i.d., while a freque...
On the Use of Latent Mixing Filters in Audio Source Separation
In this paper, we consider the underdetermined convolutive audio source separation (UCASS) problem. In the STFT domain, we consider both source signals and mixing filters as latent random variables, and we propose to estimate each source image, i.e. each individual source-filter product, by its posterior mean. Although this is a quite straightforward application of the Bayesian estimation theor...
Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation
The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduce...
Nonparametric Bayesian sparse factor analysis for frequency domain blind source separation without permutation ambiguity
Blind source separation (BSS) and sound activity detection (SAD) from a sound source mixture with minimum prior information are two major requirements for computational auditory scene analysis that recognizes auditory events in many environments. In daily environments, BSS suffers from many problems such as reverberation, a permutation problem in frequency-domain processing, and uncertainty abo...